Mixed Script Ad hoc Retrieval using back transliteration and phrase matching through bigram indexing: Shared Task report by BIT, Mesra
نویسندگان
چکیده
This paper describes an approach for Mixed-script Ad hoc retrieval, a subtask as part of FIRE 2015 Shared Task on Mixed Script Information Retrieval. We participated in subtask 2 of the shared task, where a statistical model was used to carry out back transliteration to Devanagari script. To perform the search, bigram based index of the documents were used and search was performed using pivot terms in the query.
منابع مشابه
DA-IICT in FIRE 2015 Shared Task on Mixed Script Information Retrieval
This paper aims to describe the methodology followed by Team Watchdogs in their submission for the shared task on Mixed Script Information Retrieval (MSIR) in FIRE 2015. I participated in the subtask 1 (Query Word Labelling) and 2 (Mixed-script Ad hoc retrieval). For subtask 1, Machine Learning approach using CRF classifier was used to classify the tokens as one of the possible languages using ...
متن کاملEncoding transliteration variation through dimensionality reduction: FIRE Shared Task on Transliterated Search
There exist a large amount of user generated Web content in Roman script for the languages which are written in indigenous scripts for various reasons. In the light of this phenomenon, the search engines face a non-trivial problem of matching queries and documents in transliterated space where transliterated content contain extensive spelling variation. This paper describes our proposed method ...
متن کاملImproving English and Chinese Ad-Hoc Retrieval: TIPSTER Text Phase 3 Final Report
We investigated both English and Chinese ad-hoc information retrieval (IR). Part of our objectives is to study the use of term, phrasal and topical concept level evidence, either individually or in combination, to improve retrieval accuracy. For short queries, we studied five term level techniques that together lead to improvements over standard ad-hoc 2-stage retrieval some 20% to 40% for TREC...
متن کاملA Relevance feedback based approach for mixed script transliterated text search: Shared Task report by BIT Mesra, India
This paper describes the experiments carried out as part of the participation in FIRE-2014 Transliterated Search Shared task. We participated in subtask-2 and submitted two results generated by systems based on relevant feedback approach. Given a collection of documents in mixed script, the task is to retrieve relevant documents using queries in either script. The spelling variation between dif...
متن کاملName Transliteration with Bidirectional Perceptron Edit Models
We report on our efforts as part of the shared task on the NEWS 2009 Machine Transliteration Shared Task. We applied an orthographic perceptron character edit model that we have used previously for name transliteration, enhancing it in two ways: by ranking possible transliterations according to the sum of their scores according to two models, one trained to generate left-to-right, and one right...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2015